Nozomi UDI connector #1656
Conversation
Nozomi UDI connector
Codecov Report. Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff            @@
##           develop    #1656      +/-  ##
===========================================
- Coverage    86.52%   86.44%    -0.08%
===========================================
  Files          600      596        -4
  Lines        51314    51200      -114
===========================================
- Hits         44397    44260      -137
- Misses        6917     6940       +23

☔ View full report in Codecov by Sentry.
Nozomi sample results
Nozomi JWT token session expired exception handled
},
{
  "key": "ipv6-addr.value",
  "object": "src_ip"
This mapping will create both ipv4-addr and ipv6-addr objects. You need a transformer that checks the IP version before creating the object.
The stix-shifter framework checks internally for ipv4-addr and ipv6-addr pattern matches, so a transformer is not required.
},
{
  "key": "ipv6-addr.value",
  "object": "dst_ip"
Same problem as ip_src.
The stix-shifter framework checks internally for ipv4-addr and ipv6-addr pattern matches, so a transformer is not required.
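For reference only, a hypothetical sketch of the kind of IP-version-checking transformer suggested above; per the reply, the connector does not need one because the framework's ipv4-addr/ipv6-addr pattern matching already covers this. The base-class import path is an assumption.

# Hypothetical sketch, not part of the connector: keep a value only when it is a valid IPv6 address.
import ipaddress

from stix_shifter_utils.stix_translation.src.utils.transformers import ValueTransformer  # assumed path


class FilterIPv6Value(ValueTransformer):
    """Return the value unchanged if it is IPv6, otherwise None.

    Input:  '2001:db8::1'  -> Output: '2001:db8::1'
    Input:  '192.168.0.1'  -> Output: None
    """

    @staticmethod
    def transform(obj):
        try:
            return obj if ipaddress.ip_address(obj).version == 6 else None
        except ValueError:
            # Not an IP address at all.
            return None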
{
  "key": "network-traffic.src_ref",
  "object": "nt",
  "references": "mac_addr_src"
So it replaces the src_ref that is created by ip_src. You may need to create a separate network-traffic object if there are two sources, such as IP and MAC.
The IP and MAC addresses come from the same source. We have observed that sometimes only one of these attributes is populated in an alert while the other is null, so we have added the references in both cases.
class ChainNameValue(ValueTransformer):
    """A value transformer to add default 'kill_chain_name' as 'mitre-attack' with obj"""
Can you add input/output values in the comments for all the transformers? It's easier for maintenance in the future.
Added comments with input/output values.
Added input/output samples for all the transformers.
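For illustration, a hedged sketch of what such a documented transformer could look like; the base-class import path and the exact output shape are assumptions rather than the connector's actual code.

from stix_shifter_utils.stix_translation.src.utils.transformers import ValueTransformer  # assumed path


class ChainNameValue(ValueTransformer):
    """A value transformer to add the default 'kill_chain_name' of 'mitre-attack' to obj.

    Input:  'execution'
    Output: [{'kill_chain_name': 'mitre-attack', 'phase_name': 'execution'}]
    """

    @staticmethod
    def transform(obj):
        # Pair the incoming phase name with the default kill chain name.
        return [{'kill_chain_name': 'mitre-attack', 'phase_name': str(obj)}]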
Five review comments on stix_shifter_modules/nozomi_vantage/stix_transmission/connector.py were marked as outdated and resolved.
if data:
    data = Connector.get_results_data(data)
if metadata:
    return_obj['data'] = data if data else []
page_size is static, so this can overflow past the total_records length. You can just do data[0:total_records].
From the CLI we can pass custom offset values. If the user gives offset=10 and length=1000, then we need to fetch the 10th to 1010th records. We can't always use an offset of 0.
Supporting offsets with metadata would be quite odd, and I can't really think of a use case where this would be necessary.
As a note, this comment was originally made with the idea of a page_size that does not change, hence the need to account for a data list that could be larger than the requested amount. For example, if your page_size is 99 and the length is 100, then you could get resulting data of size 198 (98 more than requested). You'd need to trim it down to 100 (or to total_records, which should usually be 0 + length when using metadata).
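A small, hypothetical sketch of the trimming described above, using the numbers from this example:

# Two fixed-size pages are fetched to satisfy a request that spans a page boundary,
# then the surplus is trimmed back to the requested length.
page_size = 99
length = 100
fetched = list(range(2 * page_size))  # 198 raw records from two API pages
trimmed = fetched[:length]            # trim down to the requested 100
assert len(trimmed) == 100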
For all the batch iterations, offset and length are used to get the requested records. Please check the latest code.
Four additional review comments on stix_shifter_modules/nozomi_vantage/stix_transmission/connector.py were marked as outdated and resolved.
lang.json api_page_size description updated. result_count removed from metadata; result_count is now calculated using page_size and page_number.
page_number = int(metadata.get('page_number', 1))
# overwrite the offset for the current batch.
offset = abs(offset - ((page_number - 1) * self.api_client.api_page_size))
Can you expand on this? I'm not really sure what this does.
If my offset is 0, this becomes larger over iterations. If my offset is 4,000, this becomes smaller over time and eventually starts to increase.
The offset passed in as the function parameter is not used directly for slicing. The offset for the current batch is calculated first, and that value is then used for slicing.
If the offset is 0, page_number starts from 1. If the offset is 4000, page_number comes in as 5 from the metadata; it does not start from 1 because 4 pages have already been processed.

Example 1: offset starts from 0.
First batch: offset=0, length=2000, pagesize=1000, page_number=1
offset = abs(0 - ((1 - 1) * 1000)); offset for the current batch is 0.
Second batch: offset=2000, length=2000, pagesize=1000, page_number=3
offset = abs(2000 - ((3 - 1) * 1000)); offset for the current batch is 0.
Third batch: offset=4000, length=2000, pagesize=1000, page_number=5
offset = abs(4000 - ((5 - 1) * 1000)); offset for the current batch is 0.

Example 2: offset does not start from 0.
First batch: offset=500, length=2000, pagesize=1000, page_number=1
offset = abs(500 - ((1 - 1) * 1000)); offset for the current batch is 500.
Second batch: offset=2500, length=2000, pagesize=1000, page_number=3
offset = abs(2500 - ((3 - 1) * 1000)); offset for the current batch is 500.
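The same calculation as a compact, hypothetical sketch (names follow the explanation above, not necessarily the final connector code):

def batch_offset(offset, page_number, page_size):
    # Subtract the records already covered by previously processed pages,
    # leaving only the offset that applies inside the current batch.
    return abs(offset - ((page_number - 1) * page_size))

# Example 1: offset starts from 0
assert batch_offset(0, 1, 1000) == 0
assert batch_offset(2000, 3, 1000) == 0
assert batch_offset(4000, 5, 1000) == 0

# Example 2: offset starts from 500
assert batch_offset(500, 1, 1000) == 500
assert batch_offset(2500, 3, 1000) == 500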
Can you provide some context on why this needs to be done? What happens if my offset is less than the page_number * pagesize?
Updated the pagination as per the suggestion.
offset = int(offset)
length = int(length)
result_count = offset
total_records, page_number, offset = self.handle_metadata(offset, length, metadata)
Can you clean up some of these variable names? We have length, result_count, expected_result_count, total_records and local_result_count. It's difficult to follow what these are actually used for.
You could add explanatory comments if you'd like to keep the names.
Inline comments are added.
return_obj = self.handle_data(data, metadata, offset, total_records, return_obj)
return_obj = self.handle_data(data, offset, expected_result_count, return_obj)
How are you handling partial pages?
For example: page_size = 100, length = 101.
What happens when it tries to get that remaining result?
For example, consider offset=0, length=101, pagesize=100.
It will make two API calls: the first page fetches 100 records and the second page fetches another 100 records.
From those 200 records, the first 101 are sliced.
In total, it returns 101 records.
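A rough, hypothetical sketch of that flow with the API call stubbed out (fetch_page is a placeholder, not the connector's real client method):

def fetch_page(page_number, page_size):
    # Stand-in for the Vantage API call; returns page_size dummy records.
    start = (page_number - 1) * page_size
    return list(range(start, start + page_size))

offset, length, page_size = 0, 101, 100

records, page = [], 1
while len(records) < offset + length:      # needs two calls for 101 records
    records.extend(fetch_page(page, page_size))
    page += 1

results = records[offset:offset + length]  # slice the first 101 of the 200 fetched
assert len(results) == 101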
What happens with the metadata in this case? What will the next query be when it goes to get the next 101 records?
Updated the pagination as per the suggestion.
Inline comments are added.
try:
    offset = int(offset)
    length = int(length)
    # result_count is the overall record count from all the previous batches processed.
This seems off. Why is offset being used to hold the total record count? We discussed before that there is a way to do this without worrying about the total number of results we've received.
I understand that what you are trying to handle is the remaining data from a partially handled page, i.e. length = 80, page_size = 100: the results will have a remainder of 20.
The easiest way to do this is to simply persist how much of the page has been processed as metadata. In the scenario above, you return back {"page": 1, "page_index": 80}, then on the next call you return back {"page": 2, "page_index": 60}, then {"page": 3, "page_index": 40}.
You are keeping track of both the current page and where you were in that page with the metadata. The first query should handle the offset, and subsequent ones should just use the page + page_index.
I'm not saying this implementation doesn't work, but I am concerned that it relies a bit too much on the caller behaving a certain way. A simple page + page_index implementation is straightforward and easier to understand: after the first call, we just ignore offset and return length records.
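A minimal, hypothetical sketch of the page + page_index bookkeeping described above (fetch_page is a stand-in for the real API call; the first-query offset handling is omitted):

def fetch_page(page_number, page_size):
    # Stand-in for the real API call; returns page_size dummy records.
    start = (page_number - 1) * page_size
    return list(range(start, start + page_size))

def get_batch(length, page_size, metadata=None):
    page = metadata["page"] if metadata else 1
    index = metadata["page_index"] if metadata else 0
    results = []
    while len(results) < length:
        page_data = fetch_page(page, page_size)
        take = page_data[index:index + (length - len(results))]
        results.extend(take)
        index += len(take)
        if index >= page_size:             # page exhausted, move to the next one
            page, index = page + 1, 0
    # Metadata handed back to the caller for the next batch.
    return results, {"page": page, "page_index": index}

# length=80, page_size=100 reproduces the example above:
batch1, meta1 = get_batch(80, 100)         # meta1 == {"page": 1, "page_index": 80}
batch2, meta2 = get_batch(80, 100, meta1)  # meta2 == {"page": 2, "page_index": 60}
batch3, meta3 = get_batch(80, 100, meta2)  # meta3 == {"page": 3, "page_index": 40}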
Updated the pagination as per the suggestion.
Hide the api_page_size field in the user interface.
Merged commit 31a9b9c into opencybersecurityalliance:develop